Goto

Collaborating Authors

 source policy


CUP: Critic-Guided Policy Reuse

Neural Information Processing Systems

The ability to reuse previous policies is an important aspect of human intelligence. To achieve efficient policy reuse, a Deep Reinforcement Learning (DRL) agent needs to decide when to reuse and which source policies to reuse. Previous methods solve this problem by introducing extra components to the underlying algorithm, such as hierarchical high-level policies over source policies, or estimations of source policies' value functions on the target task. However, training these components induces either optimization non-stationarity or heavy sampling cost, significantly impairing the effectiveness of transfer. To tackle this problem, we propose a novel policy reuse algorithm called Critic-gUided Policy reuse (CUP), which avoids training any extra components and efficiently reuses source policies.






Checklist 1. For all authors (a)

Neural Information Processing Systems

Do the main claims made in the abstract and introduction accurately reflect the paper's Did you discuss any potential negative societal impacts of your work? Did you state the full set of assumptions of all theoretical results? Did you include complete proofs of all theoretical results? Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Y es] See the Did you specify all the training details (e.g., data splits, hyperparameters, how they Did you report error bars (e.g., with respect to the random seed after running experiments multiple times)? Did you include the total amount of compute and the type of resources used (e.g., type Did you include any new assets either in the supplemental material or as a URL? [N/A] Did you discuss whether and how consent was obtained from people whose data you're If you used crowdsourcing or conducted research with human subjects... (a) We believe policy reuse serves as a promising way to transfer knowledge among AI agents.




FAST: Similarity-based Knowledge Transfer for Efficient Policy Learning

Capurso, Alessandro, Piccoli, Elia, Bacciu, Davide

arXiv.org Artificial Intelligence

--Transfer Learning (TL) offers the potential to accelerate learning by transferring knowledge across tasks. However, it faces critical challenges such as negative transfer, domain adaptation and inefficiency in selecting solid source policies. These issues often represent critical problems in evolving domains, i.e. game development, where scenarios transform and agents must adapt. The continuous release of new agents is costly and inefficient. In this work we challenge the key issues in TL to improve knowledge transfer, agents performance across tasks and reduce computational costs. The proposed methodology, called F AST - Framework for Adaptive Similarity-based Transfer, leverages visual frames and textual descriptions to create a latent representation of tasks dynamics, that is exploited to estimate similarity between environments. The similarity scores guides our method in choosing candidate policies from which transfer abilities to simplify learning of novel tasks. Experimental results, over multiple racing tracks, demonstrate that F AST achieves competitive final performance compared to learning-from-scratch methods while requiring significantly less training steps. Learning is often thought of as a process rooted in interactions with the environment. Reinforcement Learning (RL) expands on this core concept by viewing learning as a trial-and error process, in which agents engage with the environment, make choices, and receive feedback in the form of reward or penalties. Traditionally, agents are trained from scratch to accomplish a single task, requiring extensive interactions with the environment to achieve proficiency far more than a human would need for comparable tasks. One primary challenge in RL is the substantial computational demands imposed by simulation, where training time and data requirements scale up for complex tasks. In game development and other evolving environments it is expensive and sub-optimal to start at each iteration from zero.


SRSA: Skill Retrieval and Adaptation for Robotic Assembly Tasks

Guo, Yijie, Tang, Bingjie, Akinola, Iretiayo, Fox, Dieter, Gupta, Abhishek, Narang, Yashraj

arXiv.org Artificial Intelligence

Enabling robots to learn novel tasks in a data-efficient manner is a long-standing challenge. Common strategies involve carefully leveraging prior experiences, especially transition data collected on related tasks. Although much progress has been made for general pick-and-place manipulation, far fewer studies have investigated contact-rich assembly tasks, where precise control is essential. We introduce SRSA (Skill Retrieval and Skill Adaptation), a novel framework designed to address this problem by utilizing a pre-existing skill library containing policies for diverse assembly tasks. The challenge lies in identifying which skill from the library is most relevant for fine-tuning on a new task. Our key hypothesis is that skills showing higher zero-shot success rates on a new task are better suited for rapid and effective fine-tuning on that task. To this end, we propose to predict the transfer success for all skills in the skill library on a novel task, and then use this prediction to guide the skill retrieval process. We establish a framework that jointly captures features of object geometry, physical dynamics, and expert actions to represent the tasks, allowing us to efficiently learn the transfer success predictor. Extensive experiments demonstrate that SRSA significantly outperforms the leading baseline. When retrieving and fine-tuning skills on unseen tasks, SRSA achieves a 19% relative improvement in success rate, exhibits 2.6x lower standard deviation across random seeds, and requires 2.4x fewer transition samples to reach a satisfactory success rate, compared to the baseline. Furthermore, policies trained with SRSA in simulation achieve a 90% mean success rate when deployed in the real world. Please visit our project webpage https://srsa2024.github.io/.